
    Selecting the rank of truncated SVD by Maximum Approximation Capacity

    Truncated Singular Value Decomposition (SVD) calculates the closest rank-k approximation of a given input matrix. Selecting the appropriate rank k is a critical model order choice in most applications of SVD. To obtain a principled cut-off criterion for the spectrum, we convert the underlying optimization problem into a noisy channel coding problem. The optimal approximation capacity of this channel controls the appropriate strength of regularization to suppress noise. In simulation experiments, this information-theoretic method for determining the optimal rank competes with state-of-the-art model selection techniques. Comment: 7 pages, 5 figures; will be presented at the IEEE International Symposium on Information Theory (ISIT) 2011. The conference version has only 5 pages. This version has an extended appendix.
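
    As a minimal sketch of the rank-k approximation the abstract refers to, the NumPy snippet below keeps the k largest singular values of a noisy matrix. The function name `truncated_svd`, the synthetic data, and the hand-picked k = 3 are illustrative assumptions; the capacity-based rank selection proposed in the paper is not implemented here.

```python
import numpy as np

def truncated_svd(X, k):
    """Return the rank-k approximation of X from its k largest singular values."""
    # full_matrices=False gives the thin SVD, which is all the truncation needs
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Synthetic example (assumed, not from the paper): rank-3 signal plus Gaussian noise
rng = np.random.default_rng(0)
signal = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
noisy = signal + 0.1 * rng.standard_normal((50, 40))

k = 3  # chosen by hand here; choosing k is the model-selection problem
approx = truncated_svd(noisy, k)
print("rank of approximation:", np.linalg.matrix_rank(approx))
print("Frobenius error:", np.linalg.norm(noisy - approx))
```

    By the Eckart-Young theorem, the Frobenius error of this approximation equals the root sum of squares of the discarded singular values, so the choice of k directly trades approximation error against noise suppression.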

    Greedy MAXCUT Algorithms and their Information Content

    MAXCUT is a classical NP-hard graph partitioning problem and serves as a typical case of the symmetric non-monotone Unconstrained Submodular Maximization (USM) problem. Applications of MAXCUT are abundant in machine learning, computer vision and statistical physics. Greedy algorithms that approximately solve MAXCUT rely on greedy vertex labelling or on an edge contraction strategy. These algorithms have been studied by measuring their approximation ratios in the worst-case setting, but very little is known about their robustness to noise contamination of the input data in the average case. Adapting the framework of Approximation Set Coding, we present a method to exactly measure the cardinality of the algorithmic approximation sets of five greedy MAXCUT algorithms. Their information contents are explored for graph instances generated by two different noise models: the edge reversal model and the Gaussian edge weights model. The results provide insights into the robustness of different greedy heuristics and techniques for MAXCUT, which can be used for algorithm design of general USM problems. Comment: This is a longer version of the paper published in 2015 IEEE Information Theory Workshop (ITW
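
    To illustrate what greedy vertex labelling can look like, here is a minimal sketch of one common variant: vertices are visited in a fixed order and each is placed on the side that cuts more weight to its already-labelled neighbours. The function name `greedy_maxcut` and the edge-dictionary input format are assumptions for this example, and the variant is not necessarily one of the five algorithms studied in the paper.

```python
def greedy_maxcut(n, edges):
    """Greedy vertex labelling for MAXCUT.

    n     -- number of vertices, labelled 0..n-1
    edges -- dict mapping (u, v) pairs to edge weights (assumed format)
    Returns (side, cut) where side[v] is 0 or 1 and cut is the cut weight.
    """
    # Build an adjacency list so each vertex can inspect its neighbours
    adj = {v: [] for v in range(n)}
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        adj[v].append((u, w))

    side = {}
    for v in range(n):
        # Weight from v to neighbours already placed on side 0 and side 1
        w0 = sum(w for u, w in adj[v] if side.get(u) == 0)
        w1 = sum(w for u, w in adj[v] if side.get(u) == 1)
        # Putting v on side 1 cuts the edges to side 0, and vice versa
        side[v] = 1 if w0 >= w1 else 0

    cut = sum(w for (u, v), w in edges.items() if side[u] != side[v])
    return side, cut

# A 4-cycle with unit weights; it is bipartite, so all 4 edges can be cut
cycle = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
side, cut = greedy_maxcut(4, cycle)
print(side, cut)
```

    For non-negative weights, each step cuts at least half of the weight to the already-labelled neighbours, so this variant always achieves at least half of the total edge weight.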

    Learning Dictionaries with Bounded Self-Coherence

    Sparse coding in learned dictionaries has been established as a successful approach for signal denoising, source separation and solving inverse problems in general. A dictionary learning method adapts an initial dictionary to a particular signal class by iteratively computing an approximate factorization of a training data matrix into a dictionary and a sparse coding matrix. The learned dictionary is characterized by two properties: the coherence of the dictionary to observations of the signal class, and the self-coherence of the dictionary atoms. A high coherence to the signal class enables the sparse coding of signal observations with a small approximation error, while a low self-coherence of the atoms guarantees atom recovery and a more rapid decay of the residual error for the sparse coding algorithm. The two goals of high signal coherence and low self-coherence are typically in conflict, so one seeks a trade-off between them, depending on the application. We present a dictionary learning method with effective control over the self-coherence of the trained dictionary, enabling a trade-off between maximizing the sparsity of codings and approximating an equiangular tight frame. Comment: 4 pages, 2 figures; IEEE Signal Processing Letters, vol. 19, no. 12, 201
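
    The sketch below shows one way to measure the self-coherence of a dictionary, assuming it means the usual mutual coherence: the largest absolute inner product between two distinct unit-norm atoms. The function name `self_coherence` and the example dictionaries are illustrative, and the paper may use a different definition or normalization.

```python
import numpy as np

def self_coherence(D):
    """Largest absolute inner product between distinct atoms (columns) of D."""
    # Normalize every atom to unit Euclidean norm
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    # The Gram matrix holds all pairwise inner products between atoms
    G = np.abs(D.T @ D)
    # Ignore each atom's inner product with itself
    np.fill_diagonal(G, 0.0)
    return G.max()

# An orthonormal basis has self-coherence 0
print(self_coherence(np.eye(3)))

# Three unit vectors in the plane at 120 degrees form an equiangular tight frame
angles = np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0])
etf = np.vstack([np.cos(angles), np.sin(angles)])
print(self_coherence(etf))
```

    The second dictionary has three atoms in two dimensions, so it cannot be orthogonal; its coherence of about 0.5 is the smallest possible for that size, which is what makes equiangular tight frames a natural target when bounding self-coherence.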

    Nonparametric Bayesian Image Segmentation

    Image segmentation algorithms partition the set of pixels of an image into a specific number of different, spatially homogeneous groups. We propose a nonparametric Bayesian model for histogram clustering which automatically determines the number of segments when spatial smoothness constraints on the class assignments are enforced by a Markov Random Field. A Dirichlet process prior controls the level of resolution, which corresponds to the number of clusters in data with a unique cluster structure. The resulting posterior is efficiently sampled by a variant of a conjugate-case sampling algorithm for Dirichlet process mixture models. Experimental results are provided for real-world gray value images, synthetic aperture radar images and magnetic resonance imaging data.
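
    As a rough illustration of how a Dirichlet process prior leaves the number of clusters open, the sketch below draws a random partition from the Chinese restaurant process, the standard sequential view of the partitions a Dirichlet process induces. The function name `crp_partition` and the parameter values are assumptions; the sketch covers only the prior and includes neither the Markov Random Field smoothness term nor the histogram likelihood from the paper.

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample cluster sizes for n items from a Chinese restaurant process.

    alpha -- concentration parameter; larger values favour more clusters
    """
    rng = random.Random(seed)
    counts = []  # counts[c] is the current size of cluster c
    for i in range(n):
        # i items are already assigned. An existing cluster c is chosen with
        # weight counts[c]; a new cluster is opened with weight alpha.
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for c, size in enumerate(counts):
            acc += size
            if r < acc:
                counts[c] += 1
                break
        else:
            # r fell into the final interval of length alpha: open a new cluster
            counts.append(1)
    return counts

sizes = crp_partition(200, alpha=1.0)
print(len(sizes), "clusters with sizes", sizes)
```

    The number of clusters is an outcome of the sampling rather than an input, which is the sense in which the concentration parameter controls the level of resolution.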

    Exact Recovery for a Family of Community-Detection Generative Models

    Generative models for networks with communities have been studied extensively as a fertile ground for establishing information-theoretic and computational thresholds. In this paper we propose a new toy model for planted generative models called the planted Random Energy Model (P-REM), inspired by Derrida's REM. For this model we provide the asymptotic behaviour of the probability of error for the maximum likelihood estimator and hence the exact recovery threshold. As an application, we further consider the Weighted Stochastic Block Model with 2 non-equally sized communities (2-WSBM) on h-uniform hypergraphs, which is equivalent to the P-REM on both sides of the spectrum, for high and low edge cardinality h. We provide upper and lower bounds for the exact recoverability for any h, mapping these problems to the aforementioned P-REM. To the best of our knowledge these are the first consistency results for the 2-WSBM on graphs and on hypergraphs with non-equally sized community